Characterizing the Shine-Dalgarno Motif: Probability Matrices and Weight Matrices
Authors
Abstract
Methods for identifying biologically significant k-mers by exhaustive evaluation (k ≤ 10) are applied to the pooled upstream regions (USRs) of all 4289 E. coli ORFs. Instances of the Shine-Dalgarno (SD) site are readily identified using these methods. Using these motif instances as starting points, two motif representations and their training methods, probability matrices and weight matrices, are applied to characterize the complete SD motif. Despite using different representations and objective functions, both methods yield approximately the same motif characterization, providing evidence for the robustness of the result and the effectiveness of the methods. By these measures, about one quarter of the ORFs have SD sites that score no better than random.
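The pipeline described in the abstract (count k-mers, collect motif instances, build a matrix, score upstream regions) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the function names, the toy sequences, the pseudocount, and the uniform background are hypothetical, and the log-odds conversion is one common way to derive a weight matrix from a probability matrix, not necessarily the training method used in the paper.

```python
import math
from collections import Counter

BASES = "ACGT"

def count_kmers(seqs, k):
    """Exhaustively count every k-mer occurring in the pooled sequences."""
    counts = Counter()
    for s in seqs:
        for i in range(len(s) - k + 1):
            counts[s[i:i + k]] += 1
    return counts

def probability_matrix(instances, pseudocount=1.0):
    """Per-position base probabilities from equal-length motif instances."""
    width = len(instances[0])
    total = len(instances) + pseudocount * len(BASES)
    matrix = []
    for pos in range(width):
        col = Counter(inst[pos] for inst in instances)
        matrix.append({b: (col[b] + pseudocount) / total for b in BASES})
    return matrix

def log_odds_matrix(pm, background=None):
    """Log2 ratio of motif probability to background (assumed uniform)."""
    background = background or {b: 0.25 for b in BASES}
    return [{b: math.log2(col[b] / background[b]) for b in BASES} for col in pm]

def best_site(seq, wm):
    """Return (score, offset) of the highest-scoring window in seq."""
    width = len(wm)
    best = (float("-inf"), -1)
    for i in range(len(seq) - width + 1):
        score = sum(wm[j][seq[i + j]] for j in range(width))
        if score > best[0]:
            best = (score, i)
    return best

# Hypothetical toy data, not taken from the E. coli genome.
instances = ["AGGAGG", "AGGAGA", "GGGAGG", "AGGAGG"]
pm = probability_matrix(instances)
wm = log_odds_matrix(pm)
score, offset = best_site("TTTTAGGAGGTTTT", wm)
print(f"best window starts at offset {offset} with score {score:.2f}")
print(count_kmers(["TTTTAGGAGGTTTT"], 3).most_common(3))
```

Scoring each upstream region with `best_site` and comparing the scores against those from shuffled sequences would be one way to ask, in the spirit of the abstract's last sentence, how many regions contain a site that is better than random.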
Similar resources
Listeria Monocytogenes La111 and Klebsiella Pneumoniae KCTC 2242: Shine-Dalgarno Sequences
Listeria monocytogenes can cause serious infection, and relapse of listeriosis has recently been reported in leukemia and colorectal cancer; patients with Klebsiella pneumoniae are at increased risk of colorectal cancer. Recognition of the translation initiation codon is mediated mainly by the Shine-Dalgarno (SD) sequence and the anti-SD sequence of the small subunit ribosomal RNA (ssu rRNA). In this research,...
Learning Weight Matrices for Identifying Regulatory Elements
The structure of DNA regulatory patterns is partially understood, revealing an indeterminacy in the base composition. The dominant approach for representing this intrinsic variability is probability matrices, although some have used IUPAC codes and restricted regular expression languages [1]. In general the goal is to identify patterns that are distinguished from the background, where the backg...
Refining Probability Motifs for the Discovery of Existing Patterns of DNA (Bachelor Project)
The aim of this project was to build a probability motif refining program. In the past this process has been both too computationally demanding and too time-consuming to be a feasible tool in the world of bioinformatics. The notion is to take a file of DNA sequences containing hidden motifs and apply a set of given position-specific weight matrices to these sequences in order to discover the in...
Evaluating Representations for the Shine-Dalgarno Site in Escherichia coli
Several methods for identifying individual motif instances by exhaustive evaluation of k-mers (k ≤ 10) are applied to the pooled upstream regions (USRs) of all 4289 Escherichia coli ORFs. Instances of the Shine-Dalgarno (SD) site are readily identified using these methods. Using these motif instances as starting points, various motif representations and training methods, including several new alg...
The Expected Coverage Rate of Hellman Matrices
Hellman’s time-memory trade-off is a probabilistic method for inverting one-way functions, using pre-computed data. Hellman introduced this method in 1980 and obtained a lower bound for the success probability of his algorithm. After that, all further analyses of researchers are based on this lower bound. In this paper, we first studied the expected coverage rate (ECR) of the Hellman matrice...
Journal title:
Volume, issue:
Pages: -
Publication date: 2002